42 research outputs found

    Let's have a chat! A Conversation with ChatGPT: Technology, Applications, and Limitations

    The emergence of an AI-powered chatbot that can generate human-like sentences and write coherent essays has caught the world's attention. This paper provides a historical overview of chatbots and discusses the technology behind the Chat Generative Pre-trained Transformer, better known as ChatGPT. Moreover, potential applications of ChatGPT in various domains, including healthcare, education, and research, are highlighted. Despite promising results, there are several privacy and ethical concerns surrounding ChatGPT. In addition, we highlight some important limitations of the current version of ChatGPT. We also ask ChatGPT to provide its point of view and present its responses to several questions we attempt to answer. Comment: This manuscript has been accepted by Artificial Intelligence and Applications (AIA, ISSN: 2811-0854), 202

    NFTGAN: Non-Fungible Token Art Generation Using Generative Adversarial Networks

    Digital arts have gained an unprecedented level of popularity with the emergence of non-fungible tokens (NFTs). NFTs are cryptographic assets stored on blockchain networks that represent a digital certificate of ownership which cannot be forged. An NFT can be incorporated into a smart contract that allows the owner to benefit from a percentage of future sales. While digital art producers can benefit immensely from NFTs, producing the art is time-consuming. Therefore, this paper explores the possibility of using generative adversarial networks (GANs) for the automatic generation of digital art. GANs are deep learning architectures that are widely and effectively used for the synthesis of audio, image, and video content. However, their application to NFT art has been limited. In this paper, a GAN-based architecture is implemented and evaluated for generating novel NFT-style digital art. Results from the qualitative case study indicate that the generated artworks are comparable to real samples in terms of being interesting and inspiring, and they were judged to be more innovative than the real samples.

    Using Self-labeling and Co-Training to Enhance Bots Labeling in Twitter

    The rapid evolution of social bots has created a need for efficient solutions to detect them in real time. Obtaining labeled stream datasets that contain a variety of bots is essential for this classification task, yet it remains one of the challenging issues for this problem. Accordingly, finding appropriate techniques to label unlabeled data is vital to enhancing bot detection. In this paper, we investigate two labeling techniques for semi-supervised learning, self-training and co-training, and evaluate their performance for bot detection. Our results show that self-training with maximum confidence performed best, achieving an F1 measure of 0.856 and an AUC of 0.84. The random forest classifier performed better than the other classifiers in both techniques. In co-training, the single-view approach with a random forest classifier using fewer features achieved slightly better results than the single view with more features. Using multiple feature views in co-training generally achieved similar results across different splits.
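As an illustration of the self-labeling idea described above, the following sketch pseudo-labels unlabeled samples whose prediction confidence clears a threshold and then retrains. The one-dimensional nearest-mean "classifier" and the 0.8 threshold are toy assumptions, not the paper's actual models or settings.

```python
# Minimal self-training sketch: pseudo-label unlabeled samples whose
# prediction confidence exceeds a threshold, then retrain on the
# enlarged labeled set. The toy model classifies a 1-D feature by
# distance to each class mean.

def train(labeled):
    bots = [x for x, y in labeled if y == "bot"]
    humans = [x for x, y in labeled if y == "human"]
    mb = sum(bots) / len(bots)
    mh = sum(humans) / len(humans)

    def predict(x):
        db, dh = abs(x - mb), abs(x - mh)
        label = "bot" if db < dh else "human"
        # Confidence: how dominant the nearer class mean is.
        conf = max(db, dh) / (db + dh) if db + dh else 1.0
        return label, conf

    return predict

def self_train(labeled, unlabeled, threshold=0.8, rounds=3):
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        predict = train(labeled)
        added = [(x, lab) for x in pool
                 for lab, conf in [predict(x)] if conf >= threshold]
        if not added:
            break
        labeled.extend(added)
        kept = {x for x, _ in added}
        pool = [x for x in pool if x not in kept]
    return train(labeled), labeled
```

Co-training follows the same loop but maintains two classifiers over different feature views, each labeling data for the other.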

    Cost Analysis of Query-Anonymity on the Internet of Things

    A necessary function of the Internet of Things (IoT) is to sense the real world from the fabric of everyday environments. Wireless sensor networks (WSNs) are widely deployed as part of the IoT for environmental sensing, industrial monitoring, health care, and military purposes. Traditional WSNs are limited in terms of their management and usage model. As an alternative paradigm for WSN management, the sensor-cloud virtualizes physical sensors. While this model has many benefits, there are privacy issues that are not yet addressed. Query-anonymity arises when the client wants the destination physical sensor node to be indistinguishable from other potential destinations. In particular, we consider the k-anonymous query scheme, in which the query destination is indistinguishable from k-1 other probable destinations, where k is the offered level-of-anonymity. Moreover, we are interested in a communication-based approach in which the client is required to send k queries to at least k destinations, including the node of interest, in order to achieve a level-of-anonymity k. Thus, the communication-cost increases with the level-of-anonymity k, and there is a natural trade-off between the offered query-anonymity and the incurred communication-cost. The analysis of this trade-off is the main problem addressed in this work. We first develop a novel theoretical framework that quantifies the security of a general k-anonymous query scheme, adopting two approaches based on well-known security models: ciphertext indistinguishability under chosen-plaintext attack (IND-CPA) and the information-theoretic notion of perfect secrecy. Next, we provide a construction of a secure k-anonymous query scheme and introduce its detailed design and implementation, including the partition algorithm, anonymity-set construction methods, query routing algorithm, and querying protocol. Then we establish a set of average-case and worst-case bounds on the cost-anonymity trade-off, answering two important questions: what communication-cost, on average and in the worst case, is necessary, and what communication-cost is sufficient to achieve the required secure query k-anonymity? Finally, we conduct extensive simulations to analyze various performance-anonymity trade-offs, study the average-case and worst-case bounds on them, and investigate several variations of the anonymity-set construction methods. Confirming our theoretical analysis, our evaluation results show that the bounds on the average-case and worst-case cost change from quadratic asymptotic dependence on the network diameter to the same dependence on the level-of-anonymity when the latter surpasses the former. Furthermore, most of the obtained bounds on the various performance-anonymity trade-offs can be expressed precisely in terms of the offered level-of-anonymity and the network diameter.
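The communication-based scheme sends k queries to k distinct destinations (the target plus k-1 decoys), so cost grows with k. A minimal sketch of that cost model, assuming a line topology with hop-count distances and a nearest-decoy policy (both illustrative assumptions, not the paper's construction):

```python
# Toy illustration of the cost-anonymity trade-off: a k-anonymous query
# sends k queries to k distinct destinations (the target plus k-1 decoys).
# Cost is measured as total hops. Nodes are integer positions on a line,
# so the hop distance between two nodes is simply |a - b|.

def query_cost(client, target, nodes, k):
    """Total hop count of a k-anonymous query on a line topology."""
    # Pick the k-1 decoys closest to the client (cheapest to reach).
    decoys = sorted((n for n in nodes if n != target),
                    key=lambda n: abs(n - client))[:k - 1]
    anonymity_set = [target] + decoys
    assert len(anonymity_set) == k
    return sum(abs(n - client) for n in anonymity_set)
```

Raising k from 1 upward grows the total hop cost, which is exactly the trade-off the average-case and worst-case bounds quantify.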

    Hybrid feature selection approach to identify optimal features of profile metadata to detect social bots in Twitter

    The last few years have revealed that social bots in social networks have become more sophisticated in design, as they adapt their features to avoid detection systems. The ability of bots to deceptively mimic human users stems from advances in artificial intelligence and chatbots, which allow these bots to learn and adjust very quickly. Therefore, finding the optimal features needed to detect them is an area for further investigation. In this paper, we propose a hybrid feature selection (FS) method to evaluate profile metadata features and find these optimal features, which are evaluated using random forest, naïve Bayes, support vector machine, and neural network classifiers. We found that cross-validation attribute evaluation performed best compared to the other FS methods. Our results show that the random forest classifier with six optimal features achieved the best score of 94.3% for the area under the curve, while maintaining overall 89% accuracy, 83.8% precision, and 83.3% recall for the bot class. We also found that using four features, favorites_count, verified, statuses_count, and average_tweets_per_day, achieves good performance for bot detection (84.1% precision, 81.2% recall).
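A wrapper-style selection step can be pictured as scoring candidate feature subsets and keeping the best. This sketch uses exhaustive enumeration and a toy scoring function in place of cross-validated classifier performance, so it is illustrative only.

```python
from itertools import combinations

# Wrapper-style feature selection sketch: score every feature subset
# with an evaluation function and keep the highest-scoring one. In a
# real pipeline the score would be cross-validated classifier
# performance; here it is supplied by the caller.

def best_subset(features, score):
    best, best_score = None, float("-inf")
    for r in range(1, len(features) + 1):
        for subset in combinations(features, r):
            s = score(subset)
            if s > best_score:
                best, best_score = subset, s
    return best, best_score
```

Exhaustive search is exponential in the number of features; practical FS methods substitute greedy or ranking-based strategies over the same score-and-compare loop.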

    Characteristics of Similar-Context Trending Hashtags in Twitter: A Case Study

    © 2020, Springer Nature Switzerland AG. Twitter is a popular social networking platform that is widely used for discussing and spreading information on global events. Twitter trending hashtags have been one of the topics that researchers study and analyze. Understanding posting behavior patterns as information flow increases during rapidly evolving events can help in predicting future events or detecting manipulation. In this paper, we investigate similar-context trending hashtags to characterize the general behavior of specific trends and generic trends within the same context. We present an analysis that studies and compares such trends based on spatial, temporal, content, and user-activity features. We found that the characteristics of similar-context trends can be used to predict future generic trends with analogous spatiotemporal, content, and user features. Our results show that more than 70% of the users participating in a location-based hashtag belong to the location of the hashtag. Generic trends tend to attract more user participation than specific trends with a geographical context. The retweet ratio in specific trends is higher than in generic trends, exceeding 79%.
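Two of the measures reported above, the retweet ratio and location-based participation, can be sketched over toy tweet records as follows; the field names ("is_retweet", "user_location") are illustrative assumptions, not the study's actual schema.

```python
# Sketch of two hashtag measures over toy tweet records.

def retweet_ratio(tweets):
    """Fraction of tweets in the hashtag that are retweets."""
    if not tweets:
        return 0.0
    return sum(t["is_retweet"] for t in tweets) / len(tweets)

def local_participation(tweets, hashtag_location):
    """Fraction of distinct participating users located where the
    hashtag trends (each user counted once)."""
    users = {t["user"]: t["user_location"] for t in tweets}
    if not users:
        return 0.0
    local = sum(loc == hashtag_location for loc in users.values())
    return local / len(users)
```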

    On the Impact of Deep Learning and Feature Extraction for Arabic Audio Classification and Speaker Identification

    In recent times, machine learning and deep learning algorithms have contributed to advances in audio and speech recognition. Despite this progress, little emphasis has been placed on the classification of cantillation audio using deep learning. This paper introduces a dataset containing two labeled styles of cantillation from six reciters. Deep learning architectures, including convolutional neural networks (CNNs) and deep artificial neural networks (ANNs), were used to classify the recitation styles using various spectrogram features. Moreover, the classification of the six reciters was also performed using deep learning. The best performance was achieved using a CNN model with Mel spectrograms, resulting in an F1-score of 0.99 on the test set for classifying recitation style and an F1-score of 1.00 on the test set for classifying reciters. The results obtained in this work outperform existing works in the literature. The paper also discusses the impact of various audio features and deep learning algorithms as they apply to audio genre and speaker identification tasks.
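The spectrogram step that feeds the CNN follows the standard frame-window-transform recipe. This minimal sketch computes a plain magnitude spectrogram via a naive DFT; the Mel filterbank used in the work above is omitted for brevity.

```python
import cmath
import math

# Spectrogram sketch: split the signal into overlapping frames, apply a
# Hann window, and take the magnitude of a naive DFT per frame. Returns
# one magnitude spectrum (non-negative frequency bins) per frame.

def spectrogram(signal, frame_len=64, hop=32):
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        # Hann window reduces spectral leakage at the frame edges.
        windowed = [x * 0.5 * (1 - math.cos(2 * math.pi * n / (frame_len - 1)))
                    for n, x in enumerate(frame)]
        spectrum = [abs(sum(windowed[n] *
                            cmath.exp(-2j * math.pi * k * n / frame_len)
                            for n in range(frame_len)))
                    for k in range(frame_len // 2 + 1)]
        frames.append(spectrum)
    return frames
```

Real feature pipelines replace the naive DFT with an FFT and project the bins onto a Mel-scaled filterbank before taking logs.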

    The Imitation Game: Detecting Human and AI-Generated Texts in the Era of Large Language Models

    Artificial intelligence (AI)-based large language models (LLMs) hold considerable promise for revolutionizing education, research, and practice. However, distinguishing between human-written and AI-generated text has become a significant task. This paper presents a comparative study, introducing a novel dataset of human-written and LLM-generated texts across different genres: essays, stories, poetry, and Python code. We employ several machine learning models to classify the texts. The results demonstrate the efficacy of these models in discerning between human- and AI-generated text, despite the dataset's limited sample size. However, the task becomes more challenging when classifying GPT-generated text, particularly in story writing. The results indicate that the models exhibit superior performance on binary classification tasks, such as distinguishing human-generated text from a specific LLM, compared to the more complex multiclass tasks that involve discerning among human-generated text and multiple LLMs. Our findings provide insightful implications for AI text detection, and our dataset paves the way for future research in this evolving area.
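A minimal stand-in for the kind of text classifier evaluated in such studies is a bag-of-words naive Bayes model. The sketch below is a generic textbook formulation and the training phrases in the usage are invented, not the paper's models or data.

```python
import math
from collections import Counter

# Bag-of-words multinomial naive Bayes with Laplace smoothing, for
# binary (or multiclass) text classification.

class NaiveBayes:
    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.prior = {c: labels.count(c) / len(labels) for c in self.classes}
        self.counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.counts[label].update(text.lower().split())
        self.vocab = {w for c in self.counts.values() for w in c}
        return self

    def predict(self, text):
        def logp(c):
            total = sum(self.counts[c].values())
            lp = math.log(self.prior[c])
            for w in text.lower().split():
                # Laplace smoothing over the joint vocabulary avoids
                # zero probabilities for unseen words.
                lp += math.log((self.counts[c][w] + 1) /
                               (total + len(self.vocab)))
            return lp
        return max(self.classes, key=logp)
```

Swapping the class labels for "human" vs. a specific LLM gives the binary task; adding one class per LLM gives the harder multiclass task described above.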

    Real Time Detection of Social Bots on Twitter Using Machine Learning and Apache Kafka

    Social media networks, like Facebook and Twitter, are increasingly becoming an important part of most people's lives. Twitter provides a useful platform for sharing content, ideas, and opinions, and for promoting products and election campaigns. Due to its increased popularity, it has become vulnerable to malicious attacks by social bots. Social bots are automated accounts created for different purposes. They are involved in spreading rumors and false information, cyberbullying, spamming, and manipulating the ecosystem of the social network. Most social bot detection methods rely on offline data for both training and testing. In this paper, we use Apache Kafka, a big data analytics tool, to stream data from the Twitter API in real time. We use profile information (metadata) as features, and a machine learning technique is applied to predict the type of the incoming account (human or bot). In addition, the paper presents technical details of how to configure these tools.
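The per-message step of such a pipeline, parsing a streamed tweet's user object into metadata features and applying a model, can be sketched as follows. The field names follow Twitter's v1.1 user object, but the threshold rule is a toy stand-in for a trained classifier, and the Kafka consumer wiring is omitted.

```python
import json

# Per-message step of a streaming bot-detection pipeline: parse the
# tweet JSON's user object into profile-metadata features, then apply
# a model to label the account.

def extract_features(raw_message):
    user = json.loads(raw_message)["user"]
    return {
        "followers_count": user["followers_count"],
        "friends_count": user["friends_count"],
        "statuses_count": user["statuses_count"],
        "verified": int(user["verified"]),
        "default_profile": int(user["default_profile"]),
    }

def classify(features):
    # Toy rule: unverified default-profile accounts with very few
    # followers are flagged. A real deployment would call the trained
    # machine learning model here.
    if features["verified"]:
        return "human"
    if features["default_profile"] and features["followers_count"] < 10:
        return "bot"
    return "human"
```

In the full pipeline, a Kafka consumer would pull each raw message from the topic fed by the Twitter API and pass it through these two functions.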

    Suitability of Blockchain for Collaborative Intrusion Detection Systems

    © 2020 IEEE. Cyber-security is indispensable, as malicious incidents are ubiquitous on the Internet. Intrusion detection systems have an important role in detecting and thwarting cyber-attacks. However, centralized intrusion detection is subject to a central point of failure, which is especially problematic for collaborative intrusion detection systems deployed over peer-to-peer networks. The novel blockchain technology promises a fully distributed security system through its powerful features of transparency, immutability, decentralization, and provenance. Therefore, in this paper, we investigate and demonstrate several methods of collaborative intrusion detection with blockchain in order to analyze the suitability and security of blockchain for collaborative intrusion detection systems. We also study the differences among existing ways of integrating intrusion detection systems with blockchain, and categorize the major vulnerabilities of blockchain along with their potential losses and current mitigations.
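The tamper-evident alert sharing that blockchain offers collaborating detectors can be pictured as a hash-linked chain of alert blocks. The sketch below is a generic minimal chain (no consensus, no signatures), not any of the surveyed systems, and the alert fields are invented for illustration.

```python
import hashlib
import json

# Minimal hash-linked chain of intrusion alerts: each block stores the
# hash of its predecessor, so modifying any earlier alert breaks every
# later link and is detectable by all collaborating nodes.

def block_hash(block):
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def append(chain, alert):
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"alert": alert, "prev": prev})
    return chain

def verify(chain):
    return all(chain[i]["prev"] == block_hash(chain[i - 1])
               for i in range(1, len(chain)))
```

A real collaborative deployment adds consensus among the detection nodes and signatures on each alert; immutability here comes purely from the hash links.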